Data compression system with expansion protection.



A data compression system implementing expansion
protection employs one or more pairs of FIFOs to
compare the lengths of raw and processed versions of
a block of received data. A data compression system
includes a data compressor (11), a controller (19), a
FIFO (13) for compressed data, and another FIFO (15)
for uncompressed data. The FIFOs (13 and 15) are used
to compare the length of a data block processed by the
data compressor (11) with the raw version of the data
block. The shorter version is transmitted so that the
data transmitted by the data compression system is at
most negligibly expanded relative to the system input.
A code injector (17) inserts a code into the output
stream to indicate the beginning of the transmission
of a raw data block so that a receiving or retrieving
system can determine whether the data following needs
to be decompressed or not. Further codes can be
injected to indicate a switch from raw data to
processed data in the output of the compression system.
DATA COMPRESSION SYSTEM WITH EXPANSION PROTECTION



BACKGROUND OF THE INVENTION 



The present invention relates to signal process- 
ing and, more particularly, to data compression. 

Data compression is the reversible re-encoding 
of information into a more compact expression. 
This more compact expression permits information 
to be stored and/or communicated more efficiently, 
generally saving both time and expense. A typical 
encoding scheme, e.g., based on ASCII, encodes 
alphanumeric characters and other symbols into 
binary sequences. A major class of compression 
schemes encodes symbol combinations using bi- 
nary sequences not otherwise used to encode in- 
dividual symbols. Compression is effected to the 
degree that the symbol combinations represented 
in the encoding scheme are encountered in a given 
text or other file. By analogy with bilingual dic- 
tionaries used to translate between human lan- 
guages, the device that embodies the mapping of 
uncompressed code into compressed code is com- 
monly referred to as a "dictionary". 

The present invention is primarily applicable to 
dictionary-based compression schemes, which are 
part of a larger class of sequential compression 
schemes. These are contrasted with non-sequential 
schemes which examine an entire file before deter- 
mining the encoding to be used. Other sequential 
compression schemes, such as run-length limited 
(RLL) compression, can be used in conjunction 
with adaptive schemes. 

Generally, the usefulness of a dictionary-based 
compression scheme is dependent on the frequen- 
cy with which the symbol-combination entries in 
the dictionary are matched as a given file is being 
compressed. A dictionary optimized for one file 
type is unlikely to be optimized for another. For 
example, a dictionary which includes a large num- 
ber of symbol combinations likely to be found in 
newspaper text files is unlikely to compress effec- 
tively data base files, spreadsheet files, bit-mapped 
graphics files, computer-aided design files, Musical
Instrument Digital Interface (MIDI) files, etc.

Thus, a strategy using a single fixed dictionary 
might be best tied to a single application program. 
A more sophisticated strategy can incorporate 
means for identifying file types and selecting 
among a predetermined set of dictionaries accord- 
ingly. Even the more sophisticated fixed dictionary 
schemes are limited by the requirement that a file 
to be compressed must be matched to one of a 
limited number of dictionaries. Furthermore, there 
is no widely accepted standard for identifying file
types, essentially limiting multiple dictionary
schemes to specific applications or manufacturers.

Adaptive compression schemes are known in
which the dictionary used to compress a given file
is developed as that file is being compressed.
Entries are made into a dictionary as symbol
combinations are encountered in the file. The entries
are used on subsequent occurrences of an encoded
combination. Compression is effected to the extent
that the symbol combinations occurring most frequently
in the file are encountered as the dictionary is
developing. Systems incorporating adaptive compression
schemes can include means for clearing the dictionary
between files so that the dictionary can be adapted on
a file-by-file basis.
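
The adaptive behavior described above can be illustrated with a short
sketch. It assumes an LZW-style coder in the spirit of the Welch reference
cited in this section; the function name lzw_compress and the choice of one
preassigned code per byte value are illustrative assumptions, not details
taken from this document.

```python
def lzw_compress(data: bytes) -> list[int]:
    """LZW-style adaptive dictionary coder (illustrative sketch only)."""
    # Preassigned codes: one for every possible single-byte symbol.
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256  # first assignable code
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            # Keep extending the match while the combination is known.
            current = candidate
        else:
            # Emit the longest known match, then learn the new combination.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(dictionary[current])
    return codes


codes = lzw_compress(b"abababababab")
print(codes)  # [97, 98, 256, 258, 257, 260]: 6 codes for 12 symbols
```

Codes 256 and above are entries created while the input was being read, so
the gain depends entirely on combinations recurring after they have been
entered. An input with no repeated combinations, such as b"abcd", yields one
code per symbol and no compression at all.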

Adaptive compression systems and methods
are disclosed in U.S. Patent No. 4,464,650 to Eastman
et al. and U.S. Patent No. 4,558,302 to Welch.
These references explain further the use of
dictionaries in both adaptive and non-adaptive
compression strategies. Further pertinent references to
compression strategies include: G. Held, "Data
Compression: Techniques and Applications - Hardware
and Software Considerations", Wiley, 1983;
R.G. Gallager, "Variations on a Theme by Huffman",
IEEE Transactions on Information Theory,
Vol. IT-24, No. 6, pp. 668-674, November 1978; J.
Ziv and A. Lempel, "A Universal Algorithm for
Sequential Data Compression", IEEE Transactions
on Information Theory, Vol. IT-23, No. 3, pp. 337-343,
May 1977; J. Ziv and A. Lempel,
"Compression of Individual Sequences via Variable
Rate Coding", IEEE Transactions on Information
Theory, Vol. IT-24, No. 5, pp. 530-536, September
1978; and T.A. Welch, "A Technique for High Performance
Data Compression", IEEE Computer,
June 1984.

A disadvantage of such adaptive compression
techniques is that in some cases they can expand
rather than compress the data. In fact, expansion is
the rule rather than the exception when an adaptive
compression scheme is used to compress a file
which has already been compressed by that
scheme. As data compression becomes more
widely employed, the chances of data expansion
due to an attempted compression of a previously
compressed file increase. For example, an
application program can include a dedicated
compression scheme so that files created by the
program can be stored efficiently on a hard disk drive.
Likewise, a tape drive system for backing up a
hard disk can include a data compression scheme
in hardware for more efficient archiving of the hard
disk drive. In this situation, attempting data
compression during archiving can result in data
expansion rather than contraction.
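
The expansion effect can be observed with any general-purpose compressor.
The sketch below uses Python's zlib module (DEFLATE) purely as a stand-in;
it is not the adaptive scheme of the cited patents, and the synthetic
16-symbol input is an assumption chosen so that the first pass has
something to remove.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
# Synthetic "file": 20,000 symbols drawn from a 16-symbol alphabet.
original = bytes(rng.choice(b"abcdefghijklmnop") for _ in range(20_000))

once = zlib.compress(original)   # first pass removes the redundancy
twice = zlib.compress(once)      # second pass has nothing left to remove

print(len(original), len(once), len(twice))
```

The first pass shrinks the data, while the second pass only adds framing
overhead to an already dense byte stream, so its output is longer than its
input. This is the situation a backup device faces when the host has
already compressed its files.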

As data compression becomes more common,
this counterproductive scenario becomes less the
exception and more the rule. If data compression is
to be implemented in hardware so that it operates
irrespective of the type of data being compressed,
it becomes necessary to protect against unintended
data expansion. Of course, this protection must
not interfere with the process of decompression
that must occur upon the reception or retrieval of
compressed data.



SUMMARY OF THE INVENTION 



The present invention provides expansion pro- 
tection in a data compression system by selecting 
the shorter of 1) the "raw" data as received by the 
system, and 2) the "processed" data as processed 
by an incorporated data compressor. A data com- 
pression system in accordance with the present 
invention includes a data compressor, a control 
function, and one or more pairs of buffers. Each 
pair of buffers includes a buffer for raw data and a 
buffer for processed data. When the raw data buffer
is first to fill, data is transmitted from the
processed data buffer, and vice-versa.

As just indicated, when a processed data buffer 
is first to fill, the contents of the raw data buffer are 
to be transferred. Prior to this transmission, a code 
can be injected to indicate that the data following is 
raw data. Upon reception or retrieval of the data 
file, this code can be used by a decompression 
system to determine when decompression is re- 
quired and when it is not. Additional codes can be 
used to indicate the resumption or non-resumption 
of data compression. 
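
The selection rule summarized above can be sketched per block as follows.
This is a minimal illustration under stated assumptions: zlib stands in for
the incorporated data compressor, and a one-byte flag (RAW_FLAG,
PROCESSED_FLAG, both hypothetical) precedes every block, whereas the text
above marks only raw data with an injected code.

```python
import zlib

RAW_FLAG = b"\x00"        # hypothetical marker: payload is raw data
PROCESSED_FLAG = b"\x01"  # hypothetical marker: payload is compressed


def encode_block(raw: bytes) -> bytes:
    """Emit whichever version of the block is shorter, with a marker."""
    processed = zlib.compress(raw)  # stand-in compressor (assumption)
    if len(processed) < len(raw):
        return PROCESSED_FLAG + processed
    return RAW_FLAG + raw


def decode_block(block: bytes) -> bytes:
    """Use the marker to decide whether decompression is required."""
    flag, payload = block[:1], block[1:]
    if flag == PROCESSED_FLAG:
        return zlib.decompress(payload)
    return payload


compressible = b"a" * 1024
incompressible = bytes(range(256))  # every byte value once, no repeats

print(len(encode_block(compressible)) < len(compressible))  # True
print(len(encode_block(incompressible)))                    # 257
```

In this sketch the worst case costs exactly one marker byte per block,
which corresponds to the "at most negligibly expanded" behavior the
invention aims for.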

Accordingly, the present invention provides for 
expansion protection with minimal performance 
overhead. This greatly improves the commercial 
viability of general purpose hardware implementa- 
tions of adaptive compression schemes. In addi- 
tion, the invention is applicable to compression 
schemes other than those using adaptive dictionaries.
Other features and advantages of the present
invention are apparent from the description below
with reference to the following drawings.



BRIEF DESCRIPTION OF THE DRAWINGS 



FIGURE 1 is a data compression system
with expansion protection in accordance with the
present invention.

FIGURE 2 is an alternative data compression
system with expansion protection in accordance
with the present invention.



DESCRIPTION OF THE PREFERRED EMBODIMENTS



A data compression system includes a data
compressor 11, a "processed data" first-in first-out
memory (FIFO) 13, a "raw data" FIFO 15, a code
injector 17 and a controller 19. Raw data from a
host system is received along a "data in" line 101
of the data compression system and is directed to
an input 103 of compressor 11 and a data input
105 of raw data FIFO 15 concurrently. The data, as
processed by the compressor 11, is entered into
the processed data FIFO 13.

As long as compressor 11 is effectively compressing
received data, the data path through the data
compression system includes compressor 11,
processed data FIFO 13, and the data out line 107
of the data compression system. If compressor 11
is expanding rather than compressing data over a
sufficiently long data block, processed data FIFO
13 is filled before raw data FIFO 15. In this case, a
full indication from an output 109 of processed data
FIFO 13 to an input 111 of controller 19 causes the
latter to transmit a signal from an output 113 to an
input 115 to activate code injector 17 so that a
code indicating that raw data follows is injected into
the output data stream through line 107. Next,
controller 19 transmits from an output 117 via a
bus 119 to an output enable port 121 to signal raw
data FIFO 15 to transmit raw data via data out line
107 of the data compression system.

The illustrated data compressor 11 employs an
adaptive compression strategy, using a dictionary
with both preassigned and assignable codes. The
preassigned codes are associated with individual
symbols expected in the data stream. Certain symbol
combinations, selected according to the adaptive
strategy, are assigned codes as the combinations
are encountered in the raw data stream from
the host system. At least 90% of the four thousand
codes available in the dictionary of compressor 11
are assignable to symbol combinations. An assigned
code is used to translate the respective
symbol combination on subsequent occurrences of
that combination in the symbol stream.

FIFOs 13 and 15 are similar, each having a
data input port 125, 105, a data output port 127,
123, an input enable port 131, 129, an output
enable port 133, 121, a clear input 137, 135, an
empty indicator 141, 139, an almost full indicator
145, 143, and a full indicator 149, 147. Each of the
FIFOs receives data via its data input port 125,
105, when it is enabled. Each data output port 127,
123, when enabled, can transmit the contents of
the respective FIFO 13, 15 via data out line 107
from the data compression system. For each FIFO
13, 15, the respective input enable 131, 129 and
the respective output enable 133, 121 are used to
enable the respective data input port 125, 105 and
the respective data output port 127, 123. Each
clear input 137, 135, when activated, sets a pointer
in the respective FIFO 13, 15 to indicate that no
data is contained therein. When no data is contained
in a FIFO 13, 15, the respective empty
indicator 141, 139 is activated. When a FIFO 13, 15
is full, the respective full indicator 149, 147 is
activated. Provision is also made to indicate an
almost full condition at the respective almost full
indicator 145, 143. The illustrated FIFOs 13 and 15
have 1 kilobyte capacities.

Code injector 17 can inject from its output 151
a predetermined code into the data output stream
in response to a signal at its enable port 115. This
predetermined code is selected to be distinguishable
from any of the codes in the dictionary of
compressor 11 assigned to individual symbols or
symbol combinations in the data input stream.
Thus the injected code can be identified on reception
or retrieval as an indicator for a switch from
processed data to raw data in the data stream. A
decompression system can thus respond accordingly
upon reception or retrieval of the compressed
file.

Controller 19 coordinates the activities of the
components of the data compression system so
that compressed data is output therefrom as long
as compressor 11 is compressing data and so that
raw data is output therefrom once compressor 11
is found to be expanding received data. Using
unillustrated connections, controller 19 provides
handshaking with a host system as data is being
received along data in line 101 and transmitted via
data out line 107. Controller 19 also controls
compressor 11 so that it receives data in coordination
with the transmission of data by the host system.

The interface between compressor 11 and processed
data FIFO 13 is also managed by controller
19, which controls, via a bus 153 from an output
155, an output 157 of compressor 11 and, via its
input enable port 131, data input port 125 of
processed data FIFO 13. Concurrently, controller 19
controls, using its output 117 and bus 119, the
reception of raw data by the raw data FIFO 15 via its
input enable port 129.

Further details of operation are apparent from
the following typical sequence of events. It is
assumed that the dictionary has been reset prior to
the reception of data from the host system. As data
is received along data in line 101, it is entered into
raw data FIFO 15 and processed by compressor
11, the output of which is entered into processed
data FIFO 13. Initially, the compression ratio of
compressor 11 will be near 1:1, i.e., a raw data
segment is about as long as the corresponding
processed data segment.

As symbol combinations are encountered in 
the stream, some of them are selected according 
to the data compression strategy for assignment to 
assignable codes in the dictionary. As these com- 
binations are re-encountered later in the stream, a 
single output code represents the raw data symbol 
combinations, effecting at least some data com- 
pression. 

Assuming that the symbol combinations en- 
coded in the adapted dictionary appear relatively 
frequently in the file being processed, raw data 
FIFO 15 fills more quickly than processed data 
FIFO 13. Accordingly, full indications from output
147 of raw data FIFO 15 are transmitted along bus 
159 and received at an input 161 of controller 19. 
Controller 19 can then inhibit further compression 
activity and data entry into the FIFOs. 

After receiving a full indication, controller 19 
activates the output enable 133 of processed data 
FIFO 13 and the data contained therein is transmit- 
ted via data out line 107. Controller 19 signals 
compressor 11 to stop transmission as soon as a 
word boundary is reached. In the meantime, con- 
troller 19 activates clear input 135 of raw data FIFO 
15, which confirms an empty condition with an 
empty indication via empty port 139. Once pro- 
cessed data FIFO 13 has transmitted all its data, it 
sends an empty indication to controller 19 which 
then requests further data from the host system 
and activates the input enables 129, 131 of the two 
FIFOs. The process repeats until compressor 11 
starts expanding data rather than compressing 
data. 

In the event compressor 11 begins to expand
data, processed data FIFO 13 fills before raw data 
FIFO 15. In this case, an almost full indication from 
processed data FIFO 13 is sent to controller 19. 
Controller 19 then activates injector 17 which in- 
serts the predetermined switch code into the data 
output stream. Controller 19 then activates the out- 
put enable 121 of raw data FIFO 15 so that raw 
data follows the inserted code. Controller 19 then 
clears processed data FIFO 13. 
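
The sequence of events above can be simulated in a few lines. The sketch
rests on several assumptions that are not in the description: a toy
run-length coder (rle) stands in for compressor 11, the fill level of the
processed data FIFO is modeled by re-encoding the buffered raw block after
every input byte, SWITCH_CODE is a hypothetical marker value, and the
handling of a partial block at end of file (send the shorter version) is an
illustrative choice.

```python
SWITCH_CODE = b"\xff\xff"  # hypothetical marker value, not from the source


def rle(data: bytes) -> bytes:
    """Toy stand-in compressor: (count, value) pairs, runs up to 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i] and j - i < 255:
            j += 1
        out += bytes([j - i, data[i]])
        i = j
    return bytes(out)


def run_controller(data: bytes, capacity: int = 8) -> list[tuple[str, bytes]]:
    """Race a raw FIFO against a processed FIFO; send the other one."""
    output = []
    raw_fifo = bytearray()

    def send_raw() -> None:
        output.append(("code", SWITCH_CODE))  # injector marks raw data
        output.append(("raw", bytes(raw_fifo)))

    for value in data:
        raw_fifo.append(value)
        processed = rle(bytes(raw_fifo))  # modeled processed FIFO contents
        if len(processed) >= capacity:
            send_raw()                    # processed FIFO filled first
            raw_fifo.clear()
        elif len(raw_fifo) >= capacity:
            output.append(("processed", processed))  # raw FIFO filled first
            raw_fifo.clear()
    if raw_fifo:  # assumed end-of-file rule: send the shorter version
        processed = rle(bytes(raw_fifo))
        if len(processed) < len(raw_fifo):
            output.append(("processed", processed))
        else:
            send_raw()
    return output


for kind, payload in run_controller(b"aaaaaaaa" + b"abcdefgh"):
    print(kind, payload)
```

With a capacity of 8, the run of eight identical bytes fills the raw FIFO
first and is sent as a two-byte processed block. The non-repeating bytes
double in size under the toy coder, so the processed FIFO fills after only
four input bytes and each four-byte raw block is sent behind a switch code.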

The system can be configured for a one-time 
switch to raw data transmission or to alternate 
between raw and processed data transmissions 
within a single data file. In the one-shot configuration,
once the switch is made to raw data, raw data
is transmitted until the file is completely transmit- 
ted. In this case, controller 19 simply forwards 
input data to the output line 107 via raw data FIFO 
15. 

In the alternating configuration, compressor 11
continues to operate and the output source is selected
according to which FIFO fills first. In this
case, controller 19 activates code injector 17 at the
beginning and end of each block of raw data transmitted.
Thus, if upon reception or retrieval two
consecutive injector codes are detected, this indicates
that raw data continues. If only one injector
code is detected, the following bits can be taken to
represent compressed data.
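
On the receiving side, the begin/end marking reduces to a toggle: each
injected code flips the decoder between "processed" and "raw" handling, so
an end code followed immediately by a begin code leaves the decoder in raw
mode. The sketch below assumes the stream has already been split into
blocks and code tokens, that the code is unambiguously recognizable (CODE
is a hypothetical sentinel), and that the stream starts in processed mode.

```python
CODE = object()  # hypothetical sentinel standing in for the injected code


def classify(tokens: list) -> list[tuple[str, bytes]]:
    """Label each data block as 'raw' or 'processed'."""
    labeled = []
    in_raw = False  # assumption: a stream begins with processed data
    for token in tokens:
        if token is CODE:
            in_raw = not in_raw  # begin and end codes both toggle the mode
        else:
            labeled.append(("raw" if in_raw else "processed", token))
    return labeled


stream = [b"p0", CODE, b"r1", CODE, CODE, b"r2", CODE, b"p3"]
print(classify(stream))
```

In the example, the two consecutive codes between r1 and r2 close one raw
block and open the next, while the single code after r2 returns the decoder
to decompression for p3.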

A second embodiment of the present invention 
incorporates two pairs of FIFOs which operate in 
shifts so that data compression need not be delayed
while output is effected. In addition to a data
compressor 51, a controller 59, four FIFOs 53, 54,
55 and 56, and a code injector 57, the system includes
a pair of 1:2 demultiplexer switches 61 and 63 to shift
between one pair of FIFOs and the other.

In this second data compression system, data 
is received along a data input line 201 and trans- 
mitted via a data output line 203. Data compressor 
51 receives input data at an input 205 and transmits
processed data via its output 207. Data compressor
51 also includes a control port 209 for communication
with controller 59.

Demultiplexer 61 includes an input 211 for receiving
data as processed by data compressor 51,
a first output 213 for transmitting data to FIFO 53, a
second output 215 for transmitting data to FIFO 54,
and a control input 61 for receiving select signals
from controller 59. Demultiplexer 63 includes an
input 219 for receiving data via data in line 201, a
first output 221 for transmitting data to FIFO 55, a
second output 223 for transmitting data to FIFO 56,
and a control input 225 for receiving select signals
from controller 59.

FIFOs 53 and 54 include respective data inputs
233 and 234 coupled to demultiplexer 61. FIFOs 55
and 56 include respective data inputs 235 and 236,
each coupled to demultiplexer 63. FIFOs 53, 54, 55
and 56 include respective data outputs 243, 244,
245, 246 all coupled to data out line 203. Each
FIFO also includes a respective control port 253,
254, 255, 256 for communication with controller 59.
Code injector 57 includes a control input 261 and a
code output 263.

Controller 59 includes a port 271 for bi-direc- 
tional communication with data compressor 51. 
Controller 59 also has an output 273 for transmit- 
ting a select signal to demultiplexer 61 and an 
output 275 for transmitting a select signal to
demultiplexer 63. Controller 59 additionally includes ports
277 and 279 for bi-directional communication with
FIFOs 53, 54, 55 and 56. An output 281 is used to
activate code injector 57.

In operation, raw data is received concurrently
by compressor 51 and the upper 1:2 switch 63.
Switches 61 and 63 are set by controller 59 to
direct data to the respective lower FIFOs 53, 55.
Thus raw data received at upper switch 63 is input
to the lower raw data FIFO 55 and the output of
compressor 51 is directed to the lower processed
data FIFO 53. Once one of the two lower FIFOs 53,
55 is full, controller 59 changes the routing at
switches 61 and 63 so that the next block of data is
input to upper FIFOs 54 and 56. This switching is
timed to permit any necessary word boundary
cleanup by compressor 51. During this filling, the
contents of the appropriate lower FIFO 53, 55 are
transmitted via the data out line 203. Code injector
57 is activated as necessary to mark raw data as in
the previous embodiment. The main advantage of
this arrangement is that the data input stream does
not have to be interrupted during transmission of
the selected raw or compressed data out the data
out line 203.

The present invention provides for many
variations on the foregoing embodiments. A switch to
raw data can be made to occur at different thresholds
by adjusting the relative lengths of the FIFOs
or by permitting the almost full indicator to be
programmed to different lengths. The code injector
can be a separate device or integrated into another
component. For example, the raw data FIFO can
include a read only memory preset so that the
leading bits or the leading and the trailing bits of
the raw data FIFO contents are the code identifying
the transmitted data as raw data. The controller can
include a provision for injecting the code itself or
from the dictionary with a provision for bypassing
the processed data FIFO.

In embodiments with provisions for FIFOs to be
used in shifts, the switching can be performed by a
variety of means. In some embodiments one buffer
can belong to two or more buffer pairs. For example,
a data compression system can include one
buffer which is always used as the processed data
buffer, while two raw data buffers are used in
alternation. These and other modifications to and
variations of the foregoing embodiments are provided
by the present invention, the scope of which
is limited only by the following claims.


Claims 

1. A system comprising:
system input means (101) for receiving raw data;
system output means (107) for transmitting data;
a data compressor (11) for providing processed
data, said compressor (11) including a compressor
input (103) for receiving raw data and a compressor
output (157) for transmitting processed data,
said compressor input (103) being coupled to said
system input means (101);

processed data buffer means (13) for storing processed
data from said compressor (11), said processed
data buffer means (13) having a processed
data buffer input (125) coupled to said compressor
output (157) and a processed data buffer output
(127) coupled to said system output (107), said
processed data buffer means (13) including process
data buffer full indicator means (149) for providing
a process data buffer full indication when a
predetermined amount of processed data is stored
therein;

raw data buffer means (15) for storing raw data,
said raw data buffer means (15) having a raw data
buffer input (105) coupled to said system input
means (101) and a raw data buffer output (123)
coupled to said system output means (107), said
raw data buffer means (15) including raw data full
indicator means (147) for providing a raw data
buffer full indication when a predetermined amount
of data is stored therein; and

controller means (19) for controlling the flow of raw
and processed data between said system input
means (101) and said system output means (107),
said controller means (19) including means (111
and 161) for receiving full indications from said
buffers (15 and 13) and means (117 and 115) for
selectively enabling the outputs of each of said
buffer means (15, 13) so that the contents of said
processed data buffer means (13) can be transmitted
via said system output means (107) in response
to a full indication from said raw data buffer
means (15) and so that the contents of said raw
data buffer means (15) can be transmitted via said
system output means (107) in response to a full
indication from said processed data buffer means
(13).

2. The system of Claim 1 further comprising:
code injection means (17) for outputting a predetermined
code sequence via said system output
means (107), said code injection means (17) having
an injection output (151) coupled to said system
output means (107), said controller means (19)
including means (113) for activating said code
injection means (17) so as to transmit a code
identifying data transmitted from said raw data buffer
means (15) as raw data.

3. The system of Claim 1 wherein:
each of said two buffer means (13 and 15) includes
a first-in-first-out memory device (also 13 and 15).

4. The system of Claim 1 wherein:
each of said two buffer means includes a pair of
first-in-first-out memory devices (53, 54, and 55,
56) and means for filling said memory devices in
alternation (61 and 63).

5. A method comprising:
receiving raw data;
concurrently storing said raw data in a first memory
device (15) and processing (at 11) said raw data
according to a predetermined compression strategy
and storing the resulting processed data in a
second memory device (13);
determining which of said memory devices (13, 15)
is the first to fill; and
transmitting the contents of the other memory device
(15, 13).

6. The method of Claim 5 further comprising: 
transmitting a code to mark the transmitted con- 
tents as raw data when said second memory de- 
vice (13) is the first to fill. 